Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: transcode in stats #14418

Closed
wants to merge 4 commits into from

Conversation

guyfischman
Copy link

Adds transcoded videos storage usage to the server stats page, both for users and server total.

Mentioned in #14093.

@danieldietzler danieldietzler changed the title Transcode in stats feat: transcode in stats Nov 30, 2024
Copy link
Member

@danieldietzler danieldietzler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not even sure if we want this tbh. To me this sounds somewhat counterintuitive. When thinking about quota I think about how much data they are allowed to upload

@@ -117,6 +117,10 @@ export class UserRepository implements IUserRepository {
'usageVideos',
)
.addSelect('users.quotaSizeInBytes', 'quotaSizeInBytes')
.addSelect(
`ARRAY_AGG(assets.encodedVideoPath) FILTER (WHERE assets.encodedVideoPath IS NOT NULL)`,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is gonna be veeeery slow

Comment on lines +181 to +187
vi.mock('fs', async () => {
const actual = (await vi.importActual<typeof fs>('fs'))!;
return {
...actual,
statSync: vi.fn().mockReturnValue({ size: 500 }),
};
});
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now how you should be doing that. You should be mocking repo methods if you have to

}

try {
const stats = fs.statSync(path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure we already have a repo method for that. If we don't, we should add one

Copy link
Member

@danieldietzler danieldietzler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clicked the wrong button oof

Copy link
Contributor

@mertalev mertalev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not fully convinced we need this. There is an elegance to only counting the original files - it aligns with how cloud solutions count usage, is independent from the particular settings of the admin and the immich release, and leads to a more intuitive quota for the user. This PR only changes the visible stats, but this is also confusing since there's a discrepancy between what is actually included in the quota and what is displayed.

Even if we come to an agreement there, the particular approach in this PR is too bolted-on and would need to be designed in a better way that doesn't require scanning the file system and encompasses other generated files like thumbnails.

}

try {
const stats = fs.statSync(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is terribly slow and will scale poorly as the number of videos increases, both in fetching every single path from the DB at once and in stat'ing them like this.

@guyfischman
Copy link
Author

I'm not even sure if we want this tbh. To me this sounds somewhat counterintuitive. When thinking about quota I think about how much data they are allowed to upload
I'm not fully convinced we need this. There is an elegance to only counting the original files - it aligns with how cloud solutions count usage, is independent from the particular settings of the admin and the immich release, and leads to a more intuitive quota for the user. This PR only changes the visible stats, but this is also confusing since there's a discrepancy between what is actually included in the quota and what is displayed.

I agree as regards the user's expectation. My change is only for the admin panel (at least that's how I intended it).

I feel like this is gonna be veeeery slow
This is terribly slow and will scale poorly as the number of videos increases, both in fetching every single path from the DB at once and in stat'ing them like this.

I agree, and now don't know why I did it this way to begin with. I think I didn't see the transcoded filesize in the db schema and wanted something that just worked now. I guess a more reasonable solution is to have the transcoding job update the asset table with the new file's size so it can be queried directly.

Even if we come to an agreement there, the particular approach in this PR is too bolted-on and would need to be designed in a better way that doesn't require scanning the file system and encompasses other generated files like thumbnails.

Happy to include other things like thumbnails, database, and immich code itself tbh. I'd like the admin panel to just show everything immich is using. Transcoding just seemed like the biggest chunk.

@mertalev
Copy link
Contributor

mertalev commented Dec 1, 2024

You're right - the file size for transcodes etc. isn't stored anywhere. I think it'd be reasonable to add a file size column in the asset_files table and move encodedVideoPath from assets over to this table. You could change the associated jobs to include file size when updating the DB (sharp's toFile provides this info and I believe fluent-ffmpeg does as well, so you shouldn't need separate stat calls). Getting this info would then just be a query on asset_files without touching the file system.

The only limitation of this is that it won't retroactively count files generated before this change, but I think this is okay as long as it's noted somewhere.

@alextran1502
Copy link
Contributor

Hello, we have some incoming changes to the database that relate to files in the database. It will be cleaner to use that mechanism to report these stats. Hence, I am closing this PR for now. Sorry, and thank you for the contribution; for future reference, please contact us first on Discord if you plan on opening a larger PR.

@guyfischman
Copy link
Author

Hello, we have some incoming changes to the database that relate to files in the database. It will be cleaner to use that mechanism to report these stats. Hence, I am closing this PR for now. Sorry, and thank you for the contribution; for future reference, please contact us first on Discord if you plan on opening a larger PR.

Thanks - FTR I did contact on discord first :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants